Effect of Within- and Between-Speaker Variability in Voice Quality on Speaker Recognition
Abstract
Variability in voice quality is a critical factor in most speech-related applications, but studies of this variability are scarce because no adequate database has been available. Using a database that is currently under development, this study examines the effect of within- and between-speaker variability on speaker recognition systems. Preliminary results on a subset of the database show that voice quality variability can cause a significant degradation in speaker identification performance. Widely used features, such as cepstral coefficients, are also found not to be robust to changes in voice quality, but robustness can be improved by fusing them with perceptually important voice quality features. Further studies will include analyzing the whole database, improving voice quality measurement, and developing a model to represent an individual's vocal identity.
Similar resources
Speaker Identity and Voice Quality: Modeling Human Responses and Automatic Speaker Recognition
Despite recent breakthroughs in automatic speaker recognition (ASpR), system performance still degrades when utterances are short and/or when within-speaker variability is large. This study used short test utterances (2-3 sec) to investigate the effect of within-speaker variability on state-of-the-art ASpR system performance. A subset of a newly developed UCLA database is used, which contains mu...
Long Term Examination of Intra-Session and Inter-Session Speaker Variability
Session variability in speaker recognition is a well-recognized phenomenon, but it is poorly understood, largely due to a dearth of robust longitudinal data. The current study uses a large, long-term speaker database to quantify both speaker variability changes within a conversation and the impact of speaker variability changes over the long term (3 years). Results demonstrate that 1) change in accuracy...
Design of a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training
Introduction: Voice mimicking by computer has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: on one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Using Voice Quality Features to Improve Short-Utterance, Text-Independent Speaker Verification Systems
Due to within-speaker variability in phonetic content and/or speaking style, the performance of automatic speaker verification (ASV) systems degrades especially when the enrollment and test utterances are short. This study examines how different types of variability influence performance of ASV systems. Speech samples (< 2 sec) from the UCLA Speaker Variability Database containing 5 different r...
Publication date: 2015